Useful Links:
Environment Setting
# Import required packages
library(tidytransit)
library(tidyverse)
library(tmap)
library(ggplot2)
library(gtfsrouter)
library(here)
library(units)
library(sf)
library(leaflet)
library(tidycensus)
library(plotly)
setwd("D:/Dropbox (GaTech)/Work/Working/School/UA_2022/Lab/W3/")
Read GTFS Feed
Let’s download the GTFS feed provided by the Metropolitan Atlanta Rapid Transit Authority (MARTA).
# From URL
download_dest <- here::here(getwd(), "GTFS_MARTA.zip")
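# Note: download.file() returns a status code (0 = success), not the feed itself.
# mode = "wb" writes the zip as binary so it is not corrupted on Windows.
gtfs_local <- download.file("http://www.itsmarta.com/google_transit_feed/google_transit.zip", destfile = download_dest, mode = "wb")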
atl <- tidytransit::read_gtfs("http://www.itsmarta.com/google_transit_feed/google_transit.zip")
atl_temp <- gtfsrouter::extract_gtfs(download_dest)
## ▶ Unzipping GTFS archive
## ✔ Unzipped GTFS archive
## Warning: This feed contains no transfers.txt
## A transfers.txt table may be constructed with the 'gtfs_transfer_table' function
## ▶ Extracting GTFS feed
## ✔ Extracted GTFS feed
## ▶ Converting stop times to seconds
## ✔ Converted stop times to seconds
# Take a look
summary(atl)
A GTFS feed contains many relational tables describing transit service schedules, trips, stops, and routes.
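The extract_gtfs log above mentions converting stop times to seconds. GTFS stores times as "HH:MM:SS" text, and the specification allows hours past 24 for trips that continue after midnight of the service day, so ordinary clock-time parsing can fail on them. The sketch below shows the idea with a hypothetical helper, gtfs_time_to_seconds(), which is not part of gtfsrouter or tidytransit; the sample times are made up.

```r
# Hypothetical helper (not from gtfsrouter or tidytransit):
# convert GTFS "HH:MM:SS" strings to seconds after the start of the service day.
gtfs_time_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    p <- as.numeric(p)
    p[1] * 3600 + p[2] * 60 + p[3]
  }, numeric(1))
}

# Made-up sample times; the last one belongs to a trip running past midnight
times <- c("06:43:00", "23:59:30", "25:10:00")
gtfs_time_to_seconds(times)
## [1] 24180 86370 90600
```

Working in seconds keeps arithmetic such as travel time between two stops a simple subtraction, even across midnight.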
Where to get the URL?
Here are some other sources of GTFS feeds:
Transit agencies (e.g., MARTA) often publish their feeds on their own websites as well. For example, I found the data we are using today by Googling ‘MARTA GTFS’ and got it from this website (look for Official Download URL on the page).
Understand what’s inside atl
typeof(atl)
## [1] "list"
names(atl)
## [1] "agency" "calendar" "calendar_dates" "routes"
## [5] "shapes" "stop_times" "stops" "trips"
## [9] "."
head(atl$calendar)
## # A tibble: 6 × 10
## service_id monday tuesday wednesday thursday friday saturday sunday start_date
## <chr> <int> <int> <int> <int> <int> <int> <int> <date>
## 1 20 0 0 0 0 0 0 0 2022-04-23
## 2 28 0 0 0 0 0 0 0 2022-04-23
## 3 39 0 0 0 0 0 0 0 2022-04-23
## 4 24 0 0 0 0 0 0 0 2022-04-23
## 5 25 0 0 0 0 0 0 0 2022-04-23
## 6 26 0 0 0 0 0 0 0 2022-04-23
## # … with 1 more variable: end_date <date>
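The atl object is a list. names(atl) shows that it has nine elements:
eight data frames, one per GTFS table, plus a final element named ".",
which is not a GTFS table (tidytransit appears to use it for derived data).
The data frames can be linked to each other through shared key columns
such as trip_id and route_id. The diagram below shows how
the different tables are linked.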
IMAGE SOURCE: http://tidytransit.r-transit.org/articles/introduction.html
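As a minimal sketch of how these keys work in practice, the example below joins toy versions of stop_times, trips, and routes to count the distinct stops served by each route. The two route_id values and short names come from the routes output; the trip IDs and the stop assignments are made up for illustration, and real tables carry many more columns.

```r
library(dplyr)

# Toy tables with only the key columns (trip IDs are hypothetical)
routes <- tibble(
  route_id         = c("16883", "16884"),
  route_short_name = c("1", "2")
)
trips <- tibble(
  trip_id  = c("t1", "t2", "t3"),
  route_id = c("16883", "16883", "16884")
)
stop_times <- tibble(
  trip_id       = c("t1", "t1", "t2", "t2", "t3", "t3"),
  stop_id       = c("27", "28", "27", "485", "485", "470"),
  stop_sequence = c(1L, 2L, 1L, 2L, 1L, 2L)
)

# stop_times -> trips via trip_id, then trips -> routes via route_id
stops_per_route <- stop_times %>%
  inner_join(trips, by = "trip_id") %>%
  inner_join(routes, by = "route_id") %>%
  distinct(route_short_name, stop_id) %>%
  count(route_short_name, name = "n_stops")

stops_per_route
```

The same pattern applies to the real feed by swapping in atl$stop_times, atl$trips, and atl$routes.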
The table below briefly describes what each table in a GTFS feed contains. It is taken from Google’s GTFS documentation.
| Filename | Defines |
|---|---|
| agency | Transit agencies with service represented in this dataset. |
| stops | Stops where vehicles pick up or drop off riders. Also defines stations and station entrances. |
| routes | Transit routes. A route is a group of trips that are displayed to riders as a single service. |
| trips | Trips for each route. A trip is a sequence of two or more stops that occur during a specific time period. |
| stop_times | Times that a vehicle arrives at and departs from stops for each trip. |
| calendar | Service dates specified using a weekly schedule with start and end dates. This file is required unless all dates of service are defined in calendar_dates.txt. |
| calendar_dates | Exceptions for the services defined in the calendar.txt. If calendar.txt is omitted, then calendar_dates.txt is required and must contain all dates of service. |
| fare_attributes | Fare information for a transit agency’s routes. |
| fare_rules | Rules to apply fares for itineraries. |
| shapes | Rules for mapping vehicle travel paths, sometimes referred to as route alignments. |
| frequencies | Headway (time between trips) for headway-based service or a compressed representation of fixed-schedule service. |
| transfers | Rules for making connections at transfer points between routes. |
| pathways | Pathways linking together locations within stations. |
| levels | Levels within stations. |
| feed_info | Dataset metadata, including publisher, version, and expiration information. |
| translations | Translated information of a transit agency. |
| attributions | Specifies the attributions that are applied to the dataset. |
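The calendar and calendar_dates rows in this table interact: calendar gives the weekly pattern, and calendar_dates overrides it on specific dates, with exception_type 1 adding a service and 2 removing it. The sketch below works through that logic for a few services. services_on() is a hypothetical helper written for this example, the weekday flags and exception rows are copied from the MARTA output shown in this lab, and the end_date is an assumed value because it is not visible in that output.

```r
# Weekly patterns for three services (flags as in the MARTA calendar table)
calendar <- data.frame(
  service_id = c("3", "4", "5"),
  monday     = c(0, 0, 1),
  tuesday    = c(0, 0, 1),
  wednesday  = c(0, 0, 1),
  thursday   = c(0, 0, 1),
  friday     = c(0, 0, 1),
  saturday   = c(1, 0, 0),
  sunday     = c(0, 1, 0),
  start_date = as.Date("2022-04-23"),
  end_date   = as.Date("2022-08-12")  # assumed; not shown in the output
)

# Exceptions: 1 = service added on that date, 2 = service removed
calendar_dates <- data.frame(
  service_id     = c("34", "5", "29", "5"),
  date           = as.Date(c("2022-05-30", "2022-05-30", "2022-07-04", "2022-07-04")),
  exception_type = c(1L, 2L, 1L, 2L)
)

# Hypothetical helper: which service_ids run on a given date?
services_on <- function(day, calendar, calendar_dates) {
  # Fixed lookup so the result does not depend on the system locale
  weekday_cols <- c("sunday", "monday", "tuesday", "wednesday",
                    "thursday", "friday", "saturday")
  col <- weekday_cols[as.POSIXlt(day)$wday + 1]
  regular <- calendar$service_id[
    calendar[[col]] == 1 & calendar$start_date <= day & calendar$end_date >= day
  ]
  todays  <- calendar_dates[calendar_dates$date == day, ]
  added   <- todays$service_id[todays$exception_type == 1]
  removed <- todays$service_id[todays$exception_type == 2]
  sort(setdiff(union(regular, added), removed))
}

services_on(as.Date("2022-05-31"), calendar, calendar_dates)  # ordinary Tuesday
## [1] "5"
services_on(as.Date("2022-05-30"), calendar, calendar_dates)  # Monday with exceptions
## [1] "34"
```

On 2022-05-30 the weekday service 5 is removed and service 34 is added in its place, which is how a feed typically encodes a holiday schedule.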
GTFS into geospatial format
The function gtfs_as_sf converts the ‘shapes’ and ‘stops’
tables of a GTFS feed into sf objects and leaves the other tables unchanged.
atlsf <- tidytransit::gtfs_as_sf(atl, crs = 4326)
head(atlsf)
## $agency
## # A tibble: 1 × 7
## agency_id agency_name agency_url agency_timezone agency_lang agency_phone
## <chr> <chr> <chr> <chr> <chr> <chr>
## 1 MARTA Metropolitan At… https://w… America/New_Yo… en 404-848-5000
## # … with 1 more variable: agency_fare_url <chr>
##
## $calendar
## # A tibble: 16 × 10
## service_id monday tuesday wednesday thursday friday saturday sunday
## <chr> <int> <int> <int> <int> <int> <int> <int>
## 1 20 0 0 0 0 0 0 0
## 2 28 0 0 0 0 0 0 0
## 3 39 0 0 0 0 0 0 0
## 4 24 0 0 0 0 0 0 0
## 5 25 0 0 0 0 0 0 0
## 6 26 0 0 0 0 0 0 0
## 7 27 0 0 0 0 0 0 0
## 8 29 0 0 0 0 0 0 0
## 9 31 0 0 0 0 0 0 0
## 10 2 0 0 0 0 0 0 0
## 11 3 0 0 0 0 0 1 0
## 12 4 0 0 0 0 0 0 1
## 13 5 1 1 1 1 1 0 0
## 14 21 0 0 0 0 0 0 0
## 15 33 0 0 0 0 0 0 0
## 16 34 0 0 0 0 0 0 0
## # … with 2 more variables: start_date <date>, end_date <date>
##
## $calendar_dates
## # A tibble: 4 × 3
## service_id date exception_type
## <chr> <date> <int>
## 1 34 2022-05-30 1
## 2 5 2022-05-30 2
## 3 29 2022-07-04 1
## 4 5 2022-07-04 2
##
## $routes
## # A tibble: 118 × 9
## route_id agency_id route_short_name route_long_name route_desc route_type
## <chr> <chr> <chr> <chr> <chr> <int>
## 1 16883 MARTA 1 Marietta Blvd/Jose… "" 3
## 2 16884 MARTA 2 Ponce de Leon Aven… "" 3
## 3 16886 MARTA 3 Martin Luther King… "" 3
## 4 16887 MARTA 4 Moreland Avenue "" 3
## 5 16889 MARTA 5 Piedmont Road / Sa… "" 3
## 6 16888 MARTA 6 Clifton Road / Emo… "" 3
## 7 16890 MARTA 8 North Druid Hills … "" 3
## 8 16891 MARTA 9 Boulevard / Tilson… "" 3
## 9 16892 MARTA 12 Howell Mill Road /… "" 3
## 10 16893 MARTA 14 14th Street / Blan… "" 3
## # … with 108 more rows, and 3 more variables: route_url <chr>,
## # route_color <chr>, route_text_color <chr>
##
## $shapes
## Simple feature collection with 367 features and 1 field
## Geometry type: LINESTRING
## Dimension: XY
## Bounding box: xmin: -84.67085 ymin: 33.4323 xmax: -84.08274 ymax: 34.10663
## Geodetic CRS: WGS 84
## First 10 features:
## shape_id geometry
## 1 100095 LINESTRING (-84.45052 33.81...
## 2 100096 LINESTRING (-84.41341 33.73...
## 3 100097 LINESTRING (-84.38689 33.77...
## 4 100098 LINESTRING (-84.31255 33.76...
## 5 100101 LINESTRING (-84.46945 33.75...
## 6 100102 LINESTRING (-84.36823 33.75...
## 7 100103 LINESTRING (-84.35409 33.75...
## 8 100104 LINESTRING (-84.35409 33.75...
## 9 100105 LINESTRING (-84.35498 33.68...
## 10 100106 LINESTRING (-84.36614 33.69...
##
## $stop_times
## # A tibble: 1,618,393 × 10
## trip_id arrival_time departure_time stop_id stop_sequence stop_headsign
## <chr> <time> <time> <chr> <int> <chr>
## 1 7142673 06:43 06:43 27 1 ""
## 2 7142673 06:46 06:46 28 2 ""
## 3 7142673 06:49 06:49 485 3 ""
## 4 7142673 06:50 06:50 470 4 ""
## 5 7142673 06:51 06:51 796 5 ""
## 6 7142673 06:52 06:52 797 6 ""
## 7 7142673 06:53 06:53 201 7 ""
## 8 7142673 06:55 06:55 798 8 ""
## 9 7142673 06:58 06:58 799 9 ""
## 10 7142673 07:00 07:00 204 10 ""
## # … with 1,618,383 more rows, and 4 more variables: pickup_type <int>,
## # drop_off_type <int>, shape_dist_traveled <dbl>, timepoint <int>
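To make the conversion less of a black box, here is a rough sketch of the same idea done by hand with sf on a toy feed: stops become points, and the ordered points of each shape become one line. This is not tidytransit's implementation. The IDs, names, and coordinates are made up, and only the column names (stop_lat, stop_lon, shape_pt_lat, shape_pt_lon, shape_pt_sequence) follow the GTFS specification.

```r
library(sf)
library(dplyr)

# Toy stops table (made-up IDs and coordinates)
stops <- data.frame(
  stop_id   = c("A", "B"),
  stop_name = c("Toy Stop A", "Toy Stop B"),
  stop_lat  = c(33.750, 33.760),
  stop_lon  = c(-84.390, -84.380)
)

# Each stop becomes a POINT; GTFS coordinates are WGS 84 (EPSG:4326)
stops_sf <- st_as_sf(stops, coords = c("stop_lon", "stop_lat"), crs = 4326)

# Toy shapes table: one row per vertex, ordered by shape_pt_sequence
shapes <- data.frame(
  shape_id          = c("s1", "s1", "s1", "s2", "s2"),
  shape_pt_lat      = c(33.750, 33.755, 33.760, 33.770, 33.780),
  shape_pt_lon      = c(-84.390, -84.385, -84.380, -84.400, -84.410),
  shape_pt_sequence = c(1L, 2L, 3L, 1L, 2L)
)

# Sort the vertices, group them by shape, and cast each group to a LINESTRING.
# do_union = FALSE keeps the points in their sorted order.
shapes_sf <- shapes %>%
  arrange(shape_id, shape_pt_sequence) %>%
  st_as_sf(coords = c("shape_pt_lon", "shape_pt_lat"), crs = 4326) %>%
  group_by(shape_id) %>%
  summarise(do_union = FALSE) %>%
  st_cast("LINESTRING")

shapes_sf
```

The result has one row per shape_id with a LINESTRING geometry, the same layout as the $shapes element printed above.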
# Interactive mapping
tmap::tmap_mode('view')
## tmap mode set to interactive viewing
m1 <- tmap::tm_shape(atlsf$shapes) + tmap::tm_lines(alpha = 0.5)
m2 <- tmap::tm_shape(atlsf$stops) + tmap::tm_dots(id = 'stop_name', alpha = 0.5)
tmap::tmap_arrange(m1, m2, sync = TRUE)